A Lot on Our Tectonic Plates

TerpQuakes (Group 10): Hibah Khan, Akshad Thole, Mohit Jain, Hitaishi Joshi, Pranav Karmarkar, Yash Gadhiya,
Natasha Pashupathi

Section 1 - Introduction :

Project Overview -

Our project, "A Lot on Our Tectonic Plates," examines seismic activity and its implications. Using data, technology, and scientific expertise, we aim to understand the causes, effects, and patterns of earthquakes and their impact on various industry sectors, and ultimately to contribute to a safer and more resilient world in the face of this powerful natural phenomenon.

Motivation -

Earthquakes, unpredictable and destructive, shape landscapes and impact communities. The project explores earthquake behavior, emphasizing contextual factors like tectonic plate movements and fault line activity. Our motivation stems from earthquakes being among the deadliest disasters, causing extensive structural and economic damage. For instance, the February 2023 earthquake in Turkey and Syria resulted in a staggering 56,259 deaths and economic losses estimated at between US$10 billion and US$100 billion. Recovering from such a massive loss will likely take these countries years. Earthquakes not only trigger secondary events like tsunamis and landslides but also leave lasting economic scars on affected regions. By studying earthquake patterns and analyzing their economic impacts, we seek to help reduce future industrial losses. This information is critical for advocating more effective precautionary measures, enhancing disaster preparedness, and contributing to the development of more resilient societies in earthquake-prone regions.


Questions of Interest -

Our selection of questions of interest is grounded in the overarching goal of our project. Each question was chosen based on its significance in addressing different facets of earthquake occurrence, from natural causes to human-induced activities.

  1. Trend in Earthquake Activity Over Time :
    Significance: Recognizing patterns in earthquake occurrence helps in long-term preparedness and resource planning.
    Impact: Identifying trends over time informs whether seismic activity is increasing, aiding in disaster mitigation strategies.
    Expectation: Anticipating an increasing trend in earthquake frequency over time, influenced by technological advancements and potential climate change impacts. Note: Dataset limitations may skew findings towards higher magnitude earthquakes.

  2. Mining and Nuclear Explosions Impact on Earthquakes :
    Significance: Addressing human-induced seismicity is essential for responsible resource extraction.
    Impact: Knowing the correlation between mining/blasting and seismic activity contributes to sustainable practices and risk reduction.
    Expectation: Overall, a correlation between mining/blasting and increased seismic activity is expected, although limitations in our dataset focusing on earthquakes above magnitude 4 may not fully capture mining-induced seismic events.

  3. Nuclear Power Plants and Earthquake Impact :
    Significance: Understanding the potential connection between nuclear activities and seismic events is critical for safety.
    Impact: Findings guide safety measures around nuclear plants, ensuring minimal risk of earthquakes triggered by such activities.
    Expectation: Foreseeing a link between nuclear explosions and seismic activity, but the relationship with earthquake magnitude remains unclear.

  4. Regions Affected by Severe Earthquakes :
    Significance: Prioritizing earthquake-prone regions is crucial for effective disaster preparedness.
    Impact: Understanding the most affected areas informs resource allocation and drills, aiding vulnerable communities.
    Expectation: The Ring of Fire is anticipated to experience the most severe earthquakes due to its tectonic activity, guiding prioritization of drills and resource allocation.

In summary, these questions were chosen to address the diverse aspects of earthquake occurrence, combining natural and human-induced factors. The anticipated findings aim to provide actionable insights for disaster preparedness, risk reduction, and the development of resilient communities in earthquake-prone regions.


Dataset Description -

The dataset was obtained from the USGS Earthquake website (USGS) and covers global earthquakes from 2000 to 2023. It contains 321,789 observations and 22 variables, including time, location, depth, magnitude, and event type.

The variables in the dataset and their descriptions are as follows:

Column Name       Description
time              Time at which the event occurred (UTC)
latitude          Latitude of the event in decimal degrees
longitude         Longitude of the event in decimal degrees
depth             Depth of the event in kilometers
mag               Magnitude of the event; the scale used is given by magType
magType           The method or algorithm used to calculate the preferred magnitude for the event
nst               The total number of seismic stations used to determine the earthquake location
gap               The largest azimuthal gap between azimuthally adjacent stations (in degrees)
dmin              Horizontal distance from the epicenter to the nearest station (in degrees)
rms               Root-mean-square travel time residual (in seconds); a measure of how well the observed arrival times fit the predicted arrival times for this location
id                A unique identifier for the event
updated           Time when the event was most recently updated
net               The ID of the data contributor
place             Geographic region near the event
type              Type of seismic event (e.g., blasting, earthquake, explosion, etc.)
horizontalError   Uncertainty of the reported location of the event in kilometers
depthError        Uncertainty of the reported depth of the event in kilometers
magError          Uncertainty of the reported magnitude of the event (the estimated standard error of the magnitude)
magNst            The total number of seismic stations used to calculate the magnitude for this earthquake
status            Either automatic or reviewed. Automatic events are posted directly by automatic processing systems and have not been verified or altered by a human
locationSource    The network that originally authored the reported location of this event
magSource         The network that originally authored the reported magnitude for this event


Section 2 - Choice for Heavier Grading on Data Processing or Data Analysis :

We choose Data Analysis as the primary focus for grading due to the intricate nature of our earthquake study. While proficient data processing is essential, the complexity of seismic data necessitates advanced analytical techniques to uncover meaningful patterns, correlations, and insights. Our project goes beyond routine data processing by employing sophisticated methods to decipher nuanced trends, contributing to a deeper understanding of earthquake occurrences and their implications for disaster preparedness and mitigation.

Furthermore, our emphasis on data analysis is justified by the multifaceted questions of interest, which require nuanced interpretations and correlations between seismic events and various influencing factors. The intricate exploration of trends over time, human activities, and geographical patterns demands a sophisticated analytical approach, showcasing the project's commitment to deriving comprehensive and valuable insights from the dataset.



Section 3 - Data Processing :

Because the dataset is large and contains a number of anomalies, we first need to decide how to clean it. To do that, let us look at how the dataset is structured. We import the Python libraries below and use them throughout the project for various tasks.

In [1]:
# numpy and pandas are used for data processing and cleaning
# matplotlib, seaborn, and plotly are used to create visualizations
# re (regular expressions) is used for text processing
# datetime and dateutil.parser convert strings to datetime objects
# basemap and folium are used to create map plots; geopy computes geodesic distances

import pandas as pd
import numpy as np
from numpy import nan as NA
import datetime as dt
from dateutil.parser import parse
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import folium
from geopy.distance import geodesic
from folium.plugins import MarkerCluster
from folium import IFrame
from mpl_toolkits.basemap import Basemap
import re

The first step is to load the data from the file into a data frame and have a good look at it.

In [2]:
# read_csv loads the CSV file into a DataFrame that we use for the analysis
earthquake = pd.read_csv('Earthquake_2000_to_2023 Original.csv')

# head() method will display first five rows of the dataset
earthquake.head()
Out[2]:
time latitude longitude depth mag magType nst gap dmin rms ... updated place type horizontalError depthError magError magNst status locationSource magSource
0 2000-12-31T23:50:35.500Z 52.317 160.552 33.0 4.3 mb 14.0 NaN NaN 1.16 ... 2014-11-07T01:11:48.035Z 151 km ESE of Petropavlovsk-Kamchatsky, Russia earthquake NaN NaN NaN 3.0 reviewed us us
1 2000-12-31T23:42:59.690Z -26.547 -107.261 10.0 5.4 mwc 80.0 NaN NaN 0.84 ... 2016-11-10T00:20:01.175Z 225 km ENE of Hanga Roa, Chile earthquake NaN NaN NaN NaN reviewed us hrv
2 2000-12-31T22:07:00.300Z -38.530 178.930 91.0 4.0 ml 11.0 NaN NaN NaN ... 2014-11-07T01:11:48.020Z 81 km E of Gisborne, New Zealand earthquake NaN NaN NaN NaN reviewed wel wel
3 2000-12-31T21:56:50.900Z -38.040 178.800 33.0 5.3 mwc 58.0 NaN NaN NaN ... 2022-04-29T19:03:06.871Z 97 km NE of Gisborne, New Zealand earthquake NaN NaN NaN NaN reviewed wel hrv
4 2000-12-31T16:12:34.400Z -56.179 -27.217 100.0 4.5 mb 14.0 NaN NaN 0.54 ... 2014-11-07T01:11:48.004Z South Sandwich Islands region earthquake NaN NaN NaN 2.0 reviewed us us

5 rows × 22 columns

In [3]:
# Visualizing the last five rows of data
earthquake.tail()
Out[3]:
time latitude longitude depth mag magType nst gap dmin rms ... updated place type horizontalError depthError magError magNst status locationSource magSource
321784 2022-10-10T00:54:02.984Z 52.1613 -170.4070 45.180 4.8 mwr 123.0 123.0 1.815 0.78 ... 2022-12-17T22:53:48.040Z 135 km SW of Nikolski, Alaska earthquake 7.25 5.613 0.075 17.0 reviewed us us
321785 2022-10-10T00:46:57.010Z -58.7992 -24.1658 10.000 4.3 mb 12.0 179.0 8.171 0.74 ... 2022-12-17T22:54:09.040Z South Sandwich Islands region earthquake 12.14 1.924 0.168 10.0 reviewed us us
321786 2022-10-10T00:44:47.962Z 54.0830 -35.0767 10.000 4.6 mb 78.0 57.0 8.996 0.43 ... 2022-12-17T22:54:08.040Z Reykjanes Ridge earthquake 9.80 1.871 0.071 59.0 reviewed us us
321787 2022-10-10T00:26:17.442Z 38.8416 142.1723 45.627 4.6 mwr 118.0 122.0 2.124 0.55 ... 2022-12-17T22:53:48.040Z 47 km ESE of ?funato, Japan earthquake 7.96 5.771 0.063 24.0 reviewed us us
321788 2022-10-10T00:02:59.585Z 42.0966 144.1664 35.000 4.9 mb 143.0 74.0 0.756 0.58 ... 2022-12-17T22:53:48.040Z Hokkaido, Japan region earthquake 3.78 1.824 0.044 160.0 reviewed us us

5 rows × 22 columns

Exploratory Data Analysis -

Now, for us to understand what data should be cleaned, we need to perform exploratory data analysis.

Here, we will

  1. Determine the shape of the dataset.
  2. Identify the type of data contained in it.
  3. Check whether there are any missing or null values and, if so, how many.
  4. Describe statistical summary measures for all columns of the dataset.
  5. Check whether there are any duplicate values and how to remove them.
In [4]:
# Determines the number of rows and columns present in the dataset
earthquake.shape
Out[4]:
(321789, 22)
In [5]:
# Gives an overview of the type of data
earthquake.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 321789 entries, 0 to 321788
Data columns (total 22 columns):
 #   Column           Non-Null Count   Dtype  
---  ------           --------------   -----  
 0   time             321789 non-null  object 
 1   latitude         321789 non-null  float64
 2   longitude        321789 non-null  float64
 3   depth            321789 non-null  float64
 4   mag              321789 non-null  float64
 5   magType          321789 non-null  object 
 6   nst              185681 non-null  float64
 7   gap              286434 non-null  float64
 8   dmin             150241 non-null  float64
 9   rms              301990 non-null  float64
 10  net              321789 non-null  object 
 11  id               321789 non-null  object 
 12  updated          321789 non-null  object 
 13  place            319870 non-null  object 
 14  type             321789 non-null  object 
 15  horizontalError  136798 non-null  float64
 16  depthError       215716 non-null  float64
 17  magError         149854 non-null  float64
 18  magNst           275295 non-null  float64
 19  status           321789 non-null  object 
 20  locationSource   321789 non-null  object 
 21  magSource        321789 non-null  object 
dtypes: float64(12), object(10)
memory usage: 54.0+ MB
In [6]:
# describe method will provide descriptive statistical measures
earthquake.describe()
Out[6]:
latitude longitude depth mag nst gap dmin rms horizontalError depthError magError magNst
count 321789.000000 321789.000000 321789.000000 321789.000000 185681.000000 286434.000000 150241.000000 301990.000000 136798.000000 215716.000000 149854.000000 275295.000000
mean 3.790758 38.417872 79.594908 4.536560 55.856313 112.002455 3.898871 0.873939 8.596409 9.630159 0.128768 28.063746
std 29.099115 120.665175 129.250402 0.420547 76.136256 52.375836 5.040551 0.302056 3.554687 795.508198 0.071220 48.633972
min -84.422000 -179.999700 -3.290000 3.380000 0.000000 6.500000 0.000000 -1.000000 0.000000 -1.000000 0.000000 0.000000
25% -18.020600 -72.032000 10.000000 4.200000 16.000000 73.000000 1.193000 0.690000 6.200000 1.900000 0.078000 6.000000
50% 0.859000 95.086500 33.000000 4.500000 29.000000 108.000000 2.443000 0.870000 8.200000 5.800000 0.118000 13.000000
75% 28.157400 141.746000 77.300000 4.700000 61.000000 141.000000 4.509000 1.050000 10.700000 9.300000 0.162000 29.000000
max 87.386000 180.000000 735.800000 9.100000 934.000000 358.300000 64.498000 69.320000 99.000000 367558.100000 1.680000 941.000000
In [7]:
# Gives number of null values for all columns
earthquake.isnull().sum()
Out[7]:
time                    0
latitude                0
longitude               0
depth                   0
mag                     0
magType                 0
nst                136108
gap                 35355
dmin               171548
rms                 19799
net                     0
id                      0
updated                 0
place                1919
type                    0
horizontalError    184991
depthError         106073
magError           171935
magNst              46494
status                  0
locationSource          0
magSource               0
dtype: int64
In [8]:
# Determining the percentage of null values out of all the values
(earthquake.isnull().sum())/(earthquake.shape[0])*100
Out[8]:
time                0.000000
latitude            0.000000
longitude           0.000000
depth               0.000000
mag                 0.000000
magType             0.000000
nst                42.297282
gap                10.987013
dmin               53.310710
rms                 6.152790
net                 0.000000
id                  0.000000
updated             0.000000
place               0.596354
type                0.000000
horizontalError    57.488292
depthError         32.963526
magError           53.430975
magNst             14.448598
status              0.000000
locationSource      0.000000
magSource           0.000000
dtype: float64

From the output above, it is clear that the columns nst, gap, dmin, horizontalError, depthError, magError, and magNst contain many null values.

We are going to take up each of these columns and modify them as needed.
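
As a side note, the same percentages can be obtained in a single step, because the mean of a boolean mask is the fraction of True values. The sketch below uses a small made-up frame (`toy`, not the earthquake data) purely for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the earthquake data (values are made up)
toy = pd.DataFrame({
    "mag": [4.3, 5.4, 4.0, 5.3],
    "nst": [14.0, np.nan, 11.0, np.nan],
    "dmin": [np.nan, np.nan, np.nan, 1.2],
})

# isna() gives a True/False mask; its column-wise mean is the share of nulls
null_pct = toy.isna().mean() * 100

# Sort so the columns with the most missing data come first
print(null_pct.sort_values(ascending=False))
```

On the real data, replacing `toy` with `earthquake` should reproduce the percentages shown in Out[8].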


Data Cleaning -

Cleaning the data helps us remove duplicate values and irrelevant rows and impute meaningful values wherever needed. The sub-tasks are described below:

1. Filtering :

Each earthquake carries a unique ID. To ensure that no event is repeated in the data, we drop any duplicate rows.

In [9]:
# Drop duplicates

before=len(earthquake.index)
print(f"The number of rows before: {before}")
earthquake.drop_duplicates(inplace=True)
after=len(earthquake.index)
print(f"The number of rows after: {after}")
print(f"We have successfully removed {before-after} rows.")
The number of rows before: 321789
The number of rows after: 321360
We have successfully removed 429 rows.
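
Called without arguments, `drop_duplicates()` only removes rows that match in every column. Since each event carries a unique `id`, a complementary check is to look for repeated IDs whose other fields differ. A minimal sketch on made-up rows (the frame `toy` and its values are assumptions for illustration):

```python
import pandas as pd

toy = pd.DataFrame({
    "id": ["a1", "a1", "b2", "b2", "c3"],
    "mag": [4.3, 4.3, 5.0, 5.1, 4.7],
})

# Exact duplicates: every column matches (this is what drop_duplicates() removes)
exact = toy.duplicated().sum()

# Repeated IDs: same event ID, regardless of the other columns
by_id = toy.duplicated(subset="id").sum()

print(f"exact duplicate rows: {exact}, repeated ids: {by_id}")

# Keep the first record seen for each ID
deduped = toy.drop_duplicates(subset="id", keep="first")
```

If the two counts differ on the real data, some events were recorded more than once with slightly different values, and a rule such as keeping the first record would be needed.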

Let us study what columns are present in our dataset and whether we need them for analysis or not.

In [10]:
earthquake.columns
Out[10]:
Index(['time', 'latitude', 'longitude', 'depth', 'mag', 'magType', 'nst',
       'gap', 'dmin', 'rms', 'net', 'id', 'updated', 'place', 'type',
       'horizontalError', 'depthError', 'magError', 'magNst', 'status',
       'locationSource', 'magSource'],
      dtype='object')

'updated', 'rms': These columns are not used in our analysis. Their presence does not contribute to the earthquake data interpretation or the results we aim to achieve.
'horizontalError': This column has approximately 57% null values, making it unreliable for precise analysis. The high proportion of missing data in this column could lead to skewed or inaccurate interpretations, hence its removal.

Thus, we will be dropping these columns


Lower magnitude earthquakes (less than 4) are typically not considered significant for data analysis because they are less likely to cause noticeable damage or human impact, making them less relevant for assessing seismic hazards.
Source: "USGS Earthquake Hazards Program - Magnitude" at https://earthquake.usgs.gov/earthquakes/eventpage/glossary#mag

In [11]:
# Drop columns
earthquake.drop(['updated', 'horizontalError', 'rms'], axis=1, inplace=True)
In [12]:
# Drop rows with mag lower than 4
before=len(earthquake.index)
print(f"The number of rows before: {before}")
earthquake.drop(earthquake[earthquake.mag<4].index, inplace=True)
after=len(earthquake.index)
print(f"The number of rows after: {after}")
print(f"We have removed {before-after} row(s) that had a magnitude lower than 4.")
The number of rows before: 321360
The number of rows after: 321359
We have removed 1 row(s) that had a magnitude lower than 4.

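Next, we drop events that have an azimuthal gap greater than 180 degrees and a magnitude below 5.5. A gap above 180 degrees means the recording stations cover less than half of the directions around the epicenter, so the computed location and depth are less reliable. For smaller earthquakes, such poorly constrained records are unlikely to provide meaningful insights, and removing them keeps the focus on more significant seismic events.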
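
To make the `gap` criterion concrete: the azimuthal gap is the largest angle between neighboring stations when they are ordered by direction around the epicenter. The helper below, `azimuthal_gap`, is a hypothetical illustration written for this report, not a USGS routine; it takes station azimuths in degrees.

```python
def azimuthal_gap(azimuths_deg):
    """Largest angular gap (degrees) between adjacent station azimuths."""
    if not azimuths_deg:
        raise ValueError("need at least one station azimuth")
    az = sorted(a % 360 for a in azimuths_deg)
    # Gaps between neighbors, plus the wrap-around gap from the last back to the first
    gaps = [b - a for a, b in zip(az, az[1:])]
    gaps.append(360 - az[-1] + az[0])
    return max(gaps)

# Stations all around the epicenter: small gap, well-constrained location
print(azimuthal_gap([0, 90, 180, 270]))
# Stations clustered on one side: gap above 180, poorly constrained location
print(azimuthal_gap([10, 20, 30]))
```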

In [13]:
# Drop rows where gap is greater than 180 and magnitude is less than 5.5

print(f"The number of rows before is: {len(earthquake.index)}")
print(f"The number of rows we have dropped: {len(earthquake[(earthquake.gap > 180) & (earthquake.mag < 5.5)].index)}")
earthquake.drop(earthquake[(earthquake.gap > 180) & (earthquake.mag < 5.5)].index, inplace=True)
The number of rows before is: 321359
The number of rows we have dropped: 30990

We further drop rows where dmin is greater than or equal to 10 degrees and the magnitude is less than 5.5. For these smaller events the nearest station is far from the epicenter, so this step is in line with the practice of excluding less relevant or distant seismic events.

Source: (USGS)

In [14]:
# Drop rows where dmin is greater than or equal to 10 and magnitude is less than 5.5

print(f"We have dropped {len(earthquake[(earthquake.dmin >= 10) & (earthquake.mag < 5.5)].index)} rows.")
earthquake.drop(earthquake[(earthquake.dmin >= 10) & (earthquake.mag < 5.5)].index, inplace=True)
print(f"The number of rows after filtering are: {len(earthquake.index)}")
We have dropped 9005 rows.
The number of rows after filtering are: 281364
In [15]:
# Drop rows where magError is greater than or equal to 1 and depth is less than 0.6

print(f"We have dropped {len(earthquake[(earthquake.magError >= 1) & (earthquake.depth < 0.6)].index)} rows.")
earthquake.drop(earthquake[(earthquake.magError >= 1) & (earthquake.depth < 0.6)].index, inplace=True)
print(f"The number of rows after filtering are: {len(earthquake.index)}")
We have dropped 1 rows.
The number of rows after filtering are: 281363
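
One detail worth keeping in mind for the filters above: comparisons against `NaN` evaluate to `False`, so a row whose `gap` or `dmin` is missing never satisfies a drop condition and is therefore kept. The sketch below expresses the magnitude, gap, and dmin rules as boolean masks on a made-up frame (`toy` and `drop_mask` are illustrative names, not part of the notebook).

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "mag":  [3.9, 4.5, 4.5, 6.0, 4.8, 4.2],
    "gap":  [90.0, 200.0, np.nan, 200.0, 100.0, 100.0],
    "dmin": [1.0, 1.0, 1.0, 1.0, 12.0, np.nan],
})

small = toy["mag"] < 5.5
drop_mask = (
    (toy["mag"] < 4)                    # below the magnitude threshold
    | ((toy["gap"] > 180) & small)      # poorly surrounded small events
    | ((toy["dmin"] >= 10) & small)     # small events with a distant nearest station
)

kept = toy[~drop_mask]
print(f"dropped {drop_mask.sum()} of {len(toy)} rows")
print(kept)
```

Rows 2 and 5 survive only because their `gap` or `dmin` is missing, and row 3 survives because its magnitude is at least 5.5.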

2. Imputing :

To handle missing values, we fill the gaps in selected columns with that column's mean. The same method is applied to each of the following columns:

'nst' and 'magNst' Columns:
Missing values in 'nst' (number of stations used to locate the event) and 'magNst' (number of stations used to calculate the magnitude) are replaced with the column mean. This keeps each column's mean unchanged, so the imputed values represent a typical level of station coverage.

'magError' Column:
'magError' is the uncertainty of the reported magnitude. Filling missing values with the mean gives every record a standard error estimate, so the column can be used consistently across the dataset.

'gap', 'dmin', and 'depthError' Columns:
For 'gap' (azimuthal gap between seismic stations), 'dmin' (distance to the nearest station), and 'depthError' (uncertainty of the depth), the mean is likewise used to fill nulls, so the filled values follow the general tendency of each column.

With these columns filled, no rows have to be discarded because of missing values, which keeps the dataset complete for the analysis that follows.

In [16]:
columns_to_fill = ['nst', 'magNst', 'magError', 'gap', 'dmin', 'depthError']

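# Fill missing values in each column with its mean
# (assign the result back; in-place fillna on a selected column is not reliable in recent pandas versions)
for column in columns_to_fill:
    earthquake[column] = earthquake[column].fillna(earthquake[column].mean())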
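
A property of mean imputation worth noting: it leaves the column mean unchanged but reduces the spread, because every filled value sits exactly at the center. The sketch below shows this on a made-up series. It also prints the median as a possible alternative fill value; that alternative is an assumption for comparison only and is not the method used in this report.

```python
import numpy as np
import pandas as pd

# Made-up values with one extreme outlier and two gaps
s = pd.Series([2.0, 4.0, 6.0, 1000.0, np.nan, np.nan])

mean_filled = s.fillna(s.mean())
median_filled = s.fillna(s.median())

# The mean survives mean imputation; the standard deviation shrinks
print("mean before/after:", s.mean(), mean_filled.mean())
print("std  before/after:", s.std(), mean_filled.std())

# The median fill value is not pulled toward the outlier
print("mean fill value:", s.mean(), "| median fill value:", s.median())
```

This matters most for a column like depthError, whose maximum is far above its 75th percentile.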

The magType column has 24 distinct values. To simplify the dataset for analysis, we can create broader categories that group the magnitude types by their method of calculation or the type of seismic waves they measure.

Let us understand how:

In [17]:
# Take out unique values of the column using the unique() method

distinct_magType = earthquake['magType'].unique()
print('Distinct Values in the magType Column - ', distinct_magType)
print(f'The number of distinct magnitude types in the column magType = {len(distinct_magType)}')
Distinct Values in the magType Column -  ['mb' 'mwc' 'ml' 'md' 'mwb' 'mw' 'ms' 'mblg' 'mwr' 'mww' 'mlg' 'mh' 'm'
 'mc' 'Mb' 'Md' 'mb_lg' 'Ml' 'ms_20' 'mlr' 'mwp' 'mlv' 'ml(texnet)' 'Mi']
The number of distinct magnitude types in the column magType = 24

Based on our research, we grouped the magnitude types into four main categories. Each category corresponds to a method by which the earthquake magnitude was calculated.

Body Wave Magnitudes:
Category Name: "body wave"
Includes: mb, Mb, mb_lg, mblg

Surface Wave Magnitudes:
Category Name: "surface wave"
Includes: ms, ms_20

Moment Magnitudes:
Category Name: "moment"
Includes: mw, mwc, mwb, mwr, mww, mwp, ml, Ml, mlg

Duration Magnitudes:
Category Name: "duration"
Includes: md, Md, m, mh, mc, mlr, mlv, Mi

In [18]:
# Define the translation dictionary
magType_categories = {
    'body wave': ['mb', 'Mb', 'mb_lg', 'mblg'],
    'surface wave': ['ms', 'ms_20'],
    'moment': ['mw', 'mwc', 'mwb', 'mwr', 'mww', 'mwp', 'ml', 'Ml', 'mlg'],
    'duration': ['md', 'Md', 'm', 'mh', 'mc', 'mlr', 'mlv', 'Mi']
}

# Function to replace values in a column with the corresponding key from the dictionary
def replace_with_category(magType, magTypeDict):
    for category, values in magTypeDict.items():
        if magType in values:
            return category
    return magType


earthquake['magType'] = earthquake['magType'].apply(lambda magType: replace_with_category(magType, magType_categories))
In [19]:
earthquake['magType'].value_counts()
Out[19]:
magType
body wave       233890
moment           42054
duration          5224
surface wave       194
ml(texnet)           1
Name: count, dtype: int64
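
The value counts show that `ml(texnet)` passes through unchanged, because the helper returns the original value when no category matches. One way to make such leftovers explicit is to invert the dictionary into a flat lookup and send unmatched codes to a catch-all label. This is only a sketch: the label `'other'` and the names `MAGTYPE_LOOKUP` and `categorize` are our own assumptions, and the category lists are copied from the cell above.

```python
magType_categories = {
    'body wave': ['mb', 'Mb', 'mb_lg', 'mblg'],
    'surface wave': ['ms', 'ms_20'],
    'moment': ['mw', 'mwc', 'mwb', 'mwr', 'mww', 'mwp', 'ml', 'Ml', 'mlg'],
    'duration': ['md', 'Md', 'm', 'mh', 'mc', 'mlr', 'mlv', 'Mi'],
}

# Invert {category: [codes]} into {code: category} for direct lookups
MAGTYPE_LOOKUP = {
    code: category
    for category, codes in magType_categories.items()
    for code in codes
}

def categorize(mag_type, default='other'):
    """Return the category for a magType code, or `default` if it is unknown."""
    return MAGTYPE_LOOKUP.get(mag_type, default)

print(categorize('mww'))         # moment
print(categorize('ml(texnet)'))  # other

# With pandas this could be applied as:
#   earthquake['magType'].map(MAGTYPE_LOOKUP).fillna('other')
```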

Now, let's review the data after filtering and imputing.

In [20]:
# Displaying the top 10 rows of the dataset
earthquake.head(10)
Out[20]:
time latitude longitude depth mag magType nst gap dmin net id place type depthError magError magNst status locationSource magSource
0 2000-12-31T23:50:35.500Z 52.317 160.552 33.0 4.3 body wave 14.0 99.604162 2.865827 us usp000a708 151 km ESE of Petropavlovsk-Kamchatsky, Russia earthquake 9.909305 0.124552 3.000000 reviewed us us
1 2000-12-31T23:42:59.690Z -26.547 -107.261 10.0 5.4 moment 80.0 99.604162 2.865827 us usp000a707 225 km ENE of Hanga Roa, Chile earthquake 9.909305 0.124552 29.453822 reviewed us hrv
2 2000-12-31T22:07:00.300Z -38.530 178.930 91.0 4.0 moment 11.0 99.604162 2.865827 us usp000a705 81 km E of Gisborne, New Zealand earthquake 9.909305 0.124552 29.453822 reviewed wel wel
3 2000-12-31T21:56:50.900Z -38.040 178.800 33.0 5.3 moment 58.0 99.604162 2.865827 us usp000a704 97 km NE of Gisborne, New Zealand earthquake 9.909305 0.124552 29.453822 reviewed wel hrv
4 2000-12-31T16:12:34.400Z -56.179 -27.217 100.0 4.5 body wave 14.0 99.604162 2.865827 us usp000a6zz South Sandwich Islands region earthquake 9.909305 0.124552 2.000000 reviewed us us
5 2000-12-31T14:47:12.700Z -15.224 -173.797 115.1 4.5 body wave 51.0 99.604162 2.865827 us usp000a6zw 80 km N of Hihifo, Tonga earthquake 9.909305 0.124552 12.000000 reviewed us us
6 2000-12-31T13:04:57.260Z -19.416 -176.181 246.5 4.1 body wave 23.0 99.604162 2.865827 us usp000a6zr 196 km WNW of Pangai, Tonga earthquake 9.909305 0.124552 5.000000 reviewed us us
7 2000-12-31T12:56:12.900Z -12.901 166.822 50.4 4.8 body wave 38.0 99.604162 2.865827 us usp000a6zq 133 km NW of Sola, Vanuatu earthquake 9.909305 0.124552 7.000000 reviewed us us
8 2000-12-31T12:05:06.470Z -44.734 -79.654 33.0 4.5 body wave 15.0 99.604162 2.865827 us usp000a6zn Off the coast of Aisen, Chile earthquake 9.909305 0.124552 3.000000 reviewed us us
9 2000-12-31T08:56:39.780Z 44.653 147.918 100.0 4.5 body wave 20.0 99.604162 2.865827 us usp000a6zj 63 km S of Kuril’sk, Russia earthquake 9.909305 0.124552 2.000000 reviewed us us

3. Text Processing :

The date and time are combined in a single column of the dataset; we extract the date into a separate column.

In [21]:
# Function to extract the date of the earthquake from the time column, ignoring any null values
def extract_date(time):
    if pd.notnull(time):
        parts = time.split('T')
        if len(parts) > 1:
            return parts[0].strip()
    return None

# Create a new column 'date' using the output of the previous function
earthquake['date'] = earthquake['time'].apply(extract_date)
In [22]:
# Parse string to datetime format
earthquake.time = earthquake.time.map(lambda x: parse(x))
In [23]:
# Displaying the top 5 rows with the new column 'date'
earthquake.head()
Out[23]:
time latitude longitude depth mag magType nst gap dmin net id place type depthError magError magNst status locationSource magSource date
0 2000-12-31 23:50:35.500000+00:00 52.317 160.552 33.0 4.3 body wave 14.0 99.604162 2.865827 us usp000a708 151 km ESE of Petropavlovsk-Kamchatsky, Russia earthquake 9.909305 0.124552 3.000000 reviewed us us 2000-12-31
1 2000-12-31 23:42:59.690000+00:00 -26.547 -107.261 10.0 5.4 moment 80.0 99.604162 2.865827 us usp000a707 225 km ENE of Hanga Roa, Chile earthquake 9.909305 0.124552 29.453822 reviewed us hrv 2000-12-31
2 2000-12-31 22:07:00.300000+00:00 -38.530 178.930 91.0 4.0 moment 11.0 99.604162 2.865827 us usp000a705 81 km E of Gisborne, New Zealand earthquake 9.909305 0.124552 29.453822 reviewed wel wel 2000-12-31
3 2000-12-31 21:56:50.900000+00:00 -38.040 178.800 33.0 5.3 moment 58.0 99.604162 2.865827 us usp000a704 97 km NE of Gisborne, New Zealand earthquake 9.909305 0.124552 29.453822 reviewed wel hrv 2000-12-31
4 2000-12-31 16:12:34.400000+00:00 -56.179 -27.217 100.0 4.5 body wave 14.0 99.604162 2.865827 us usp000a6zz South Sandwich Islands region earthquake 9.909305 0.124552 2.000000 reviewed us us 2000-12-31
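
For reference, pandas can do both steps (parsing and date extraction) with its vectorized datetime tools, which is typically faster than applying a Python function to each row. A minimal sketch on two timestamps taken from the rows shown earlier (the variable names are illustrative):

```python
import pandas as pd

times = pd.Series([
    "2000-12-31T23:50:35.500Z",
    "2022-10-10T00:02:59.585Z",
])

# Parse the ISO 8601 strings into timezone-aware (UTC) timestamps
parsed = pd.to_datetime(times, utc=True)

# Derive the calendar date and the year from the parsed values
dates = parsed.dt.strftime("%Y-%m-%d")
years = parsed.dt.year

print(parsed)
print(dates.tolist(), years.tolist())
```

Having the year available as a number is convenient for the trend-over-time question.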

If we look at the place column, we can see that its values are inconsistent. We make them consistent so that each value is either a U.S. state or the name of a country.

In [24]:
# Extract the text after the last comma in the 'place' column (the country or state)
# Descriptions without a comma do not match the pattern and become NaN
earthquake.place = earthquake.place.str.extract(r', (\w+[\s\w]*)$', expand=False)

# Remove the word ' region' from the 'place' column, if present
earthquake.place = earthquake.place.str.replace(' region', '')
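
To see what these two steps do to individual descriptions, the sketch below applies the same pattern with Python's `re` module to three place strings from the rows shown earlier. `extract_place` is a hypothetical helper, not part of the notebook. Note that a description without a comma yields no match, which corresponds to a missing value after `str.extract`.

```python
import re

PLACE_PATTERN = re.compile(r', (\w+[\s\w]*)$')

def extract_place(place):
    """Mimic the two pandas steps above on a single string."""
    match = PLACE_PATTERN.search(place)
    if match is None:
        return None  # no comma-separated suffix in the description
    return match.group(1).replace(' region', '')

samples = [
    '151 km ESE of Petropavlovsk-Kamchatsky, Russia',
    'Hokkaido, Japan region',
    'South Sandwich Islands region',
]
for text in samples:
    print(f'{text!r} -> {extract_place(text)!r}')
```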

Now, we can also see that we have various U.S. states denoted by their abbreviations. We shall also rename them to match their actual state names.

In [25]:
# Extract non-null values from the 'place' column and filter those containing two uppercase letters using regex
earthquake.place.dropna()[earthquake.place.dropna().str.contains('^[A-Z]{2}$')].value_counts()
Out[25]:
place
CA    469
MX     41
AK     14
NV      2
OK      1
WA      1
Name: count, dtype: int64
In [26]:
# Mapping state/country abbreviations to full names
state_dict = {
    'CA': 'California',
    'MX': 'Mexico',
    'AK': 'Alaska',
    'NV': 'Nevada',
    'OK': 'Oklahoma',
    'WA': 'Washington'
}

# Replace abbreviations with full names (whole-value matches only, so no regex is needed)
earthquake.place = earthquake.place.replace(state_dict)

# Show changed 'place' column's value counts 
earthquake.place.value_counts()
Out[26]:
place
Indonesia           34160
Japan               27502
Papua New Guinea    15535
Chile               12921
Philippines         12195
                    ...  
Belgium                 1
Cameroon                1
Denmark                 1
Marshall Islands        1
Kentucky                1
Name: count, Length: 226, dtype: int64

Exploratory Data Analysis post Data Cleaning -

A crucial post-data cleaning step involves conducting an exploratory analysis of the dataset, enabling us to uncover key observations and gain insights that will guide the selection of appropriate methods to address our research questions.

In [27]:
#Exploratory Data Analysis
print("\nSummary Statistics:")

earthquake.describe()
Summary Statistics:
Out[27]:
latitude longitude depth mag nst gap dmin depthError magError magNst
count 281363.000000 281363.000000 281363.000000 281363.000000 281363.000000 281363.000000 281363.000000 281363.000000 281363.000000 281363.000000
mean 4.931550 42.579743 82.893424 4.560977 60.608269 99.604162 2.865827 9.909305 0.124552 29.453822
std 28.767533 119.515324 131.235951 0.431818 60.679682 36.985163 1.584898 696.542434 0.046802 46.344296
min -84.422000 -179.999700 -3.290000 4.000000 0.000000 6.500000 0.000000 -1.000000 0.000000 0.000000
25% -17.388050 -71.024200 10.000000 4.300000 27.000000 73.400000 2.556000 4.400000 0.121000 7.000000
50% 1.304000 99.378000 33.630000 4.500000 60.608269 99.604162 2.865827 9.100000 0.124552 18.000000
75% 30.004700 141.927050 85.200000 4.700000 60.608269 126.000000 2.865827 9.909305 0.124552 29.453822
max 87.386000 180.000000 735.800000 9.100000 934.000000 313.000000 39.730000 367558.100000 1.642000 941.000000

Some of the observations here are :

  1. The average (mean) magnitude (mag) of earthquakes in this dataset is approximately 4.6, which suggests that most earthquakes in this set are of moderate intensity.
  2. The greatest depth at which an earthquake is recorded is 735.8 kilometers, indicating some very deep seismic activity, while the shallowest is at a depth of -3.29 kilometers, which may indicate an above-sea-level seismic event or a potential data error.
  3. The highest magnitude recorded is 9.1, which indicates an extremely powerful earthquake, given that the scale is logarithmic and each whole number step represents a tenfold increase in amplitude.
  4. There is significant variation in the 'depth' of earthquakes, as shown by the standard deviation (std) being around 131.2 kilometers.
  5. The 'magError', which is the uncertainty of the reported magnitude, has a minimum value of 0, suggesting that for some earthquakes the magnitude was determined with high confidence or that no uncertainty was reported.
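
The logarithmic scale mentioned in observation 3 can be made concrete with a short calculation. This is a minimal sketch for illustration only: the function names are our own, and the energy estimate assumes the commonly quoted rule of thumb that radiated energy grows by a factor of about 10^1.5 (roughly 32) per whole magnitude step.

```python
def amplitude_ratio(mag_a, mag_b):
    """How many times larger the recorded wave amplitude of mag_a is than that of mag_b."""
    return 10 ** (mag_a - mag_b)


def energy_ratio(mag_a, mag_b):
    """Approximate ratio of radiated energy, assuming the 10^(1.5 * difference) rule of thumb."""
    return 10 ** (1.5 * (mag_a - mag_b))


# Compare the largest magnitude in the summary table with the mean magnitude
largest, mean = 9.1, 4.56
print(f"Amplitude ratio: {amplitude_ratio(largest, mean):,.0f}x")
print(f"Energy ratio:    {energy_ratio(largest, mean):,.0f}x")
```

The comparison shows why a mean near 4.6 and a maximum of 9.1 describe events of very different scale, even though the numbers look close.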


Section 4 - Data Analysis :¶

As we embark on this investigative journey, our goal is to unravel the intricacies of earthquakes, paving the way for informed strategies to mitigate their impact. By addressing key questions, we seek to not only comprehend the nature of seismic events but also lay the groundwork for effective solutions.

Correlation Analysis : Uncovering correlations between seismic attributes is a fundamental first step, and heatmaps make those correlations easy to read. These graphic representations offer rapid insights into the relationships between variables and aid our comprehension of earthquake dynamics. Identifying the patterns and strengths of these relationships at the outset can guide strategies for resilience in seismic regions.

Heatmaps visually represent the correlation matrix of seismic attributes, revealing patterns and insights. The color-coded gradients quickly identify strong or weak correlations, providing visual clarity.

In [28]:
eq_correlation_matrix = earthquake.corr(numeric_only=True)

# Create a heatmap using seaborn; a boolean mask hides the upper triangle (including the diagonal)
plt.figure(figsize=(9, 6))
sns.heatmap(eq_correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5, mask=np.triu(np.ones_like(eq_correlation_matrix, dtype=bool)))
plt.title('Correlation Heatmap for Earthquake')
plt.show()

The heatmap reveals potential correlations between different seismic parameters, including gap, dmin (minimum distance to nearest station), and nst (number of reporting stations). These correlations may show trends in the relationships between various seismic attributes. We consider a few here, with observations and possible inferences drawn from them:

  1. Magnitude-depth correlation:

Observation - The heatmap shows a negative correlation between earthquake depth and magnitude: in this dataset, deeper earthquakes tend to have lower magnitudes, while shallower earthquakes tend to have higher magnitudes. A correlation coefficient alone does not establish a causal link, so this pattern should be read as a tendency rather than a rule.

Business Implication - Planning for Infrastructure Resilience : Enterprises, specifically those engaged in vital infrastructure (like energy, transportation, and utilities), might have to take into account the relationship between the depth and magnitude of earthquakes. Planning for infrastructure resilience could still be impacted by deeper earthquakes of possibly lesser magnitudes. The design and retrofitting of structures to withstand varied magnitudes and depths can be informed by an understanding of the characteristics of seismic events.

  2. Magnitude-Nst Relationship:

Observation - The correlation matrix suggests a positive correlation between the magnitude of earthquakes (mag) and the number of reporting stations (nst, magNst). Higher-magnitude earthquakes are recorded by more seismic stations, which would produce a positive correlation between these variables.

Business Implication - Risk assessment and insurance: When evaluating risk associated with seismic damage, insurance companies may take into account the relationship between the number of reporting stations and the magnitude of the earthquake. Greater potential for damage may be indicated by higher magnitude earthquakes that draw more reporting stations, which could have an impact on risk assessment models and insurance rates.
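
To show what the numbers in the heatmap cells represent, here is a minimal sketch that computes the Pearson correlation coefficient by hand and ranks attribute pairs by strength. The values in `toy` are hypothetical and are not taken from the earthquake dataset; the helper names `pearson` and `ranked_pairs` are our own.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranked_pairs(columns):
    """All column pairs with their correlation, strongest (by absolute value) first."""
    pairs = [(a, b, pearson(columns[a], columns[b])) for a, b in combinations(columns, 2)]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# Hypothetical toy values, NOT taken from the earthquake dataset
toy = {
    "mag":   [4.1, 4.5, 5.0, 5.8, 6.4],
    "depth": [120.0, 35.0, 60.0, 10.0, 25.0],
    "nst":   [12, 30, 45, 110, 200],
}

for a, b, r in ranked_pairs(toy):
    print(f"{a:>5} vs {b:<5} r = {r:+.2f}")
```

Ranking by absolute value keeps strong negative relationships from being overlooked next to strong positive ones.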


Trend in Earthquake Activity Over Time -¶

Let us delve into the temporal aspects of earthquake occurrences to discern patterns and trends over time and address our first concern.

Q1. Is there a noticeable trend or pattern in earthquake activity over time? If so, what is causing the trend?

To gain comprehensive insights into the nature of earthquakes, we employ two pivotal visualizations: a line chart and a histogram. These will provide insights into the severity and frequency of earthquakes.

Line Chart: Frequency of Earthquakes Over Time

The line chart here shows how earthquake frequency is distributed over successive time intervals. This analysis aims to uncover patterns in the occurrence of seismic events and identify potential clusters or spikes.

In [29]:
# create a new df for time series analysis
quake_ts = earthquake.copy() 

# Resetting the index only if it's already set
if not quake_ts.index.name == 'index':
    quake_ts.reset_index(drop=True, inplace=True)

# Create bins
bins = np.linspace(start=1999, stop=2023, num=9)

# Extract the year
quake_ts['year'] = quake_ts['time'].apply(lambda t: int(str(t).split('-')[0]))

# Assign each year to a bin using the edges in 'bins' (intervals are right-closed, e.g. (1999, 2002])
quake_ts['Year Bin'] = pd.cut(quake_ts['year'], bins)

# Get the counts for each bin
bin_counts = quake_ts['Year Bin'].value_counts().sort_index()

# Extract bin labels for x-axis
bin_labels = [f'{int(start)}-{int(stop)}' for start, stop in zip(bins[:-1], bins[1:])]

# Create marker labels as whole numbers
marker_labels = [f"{count:.0f}" for count in bin_counts]

# Create a Plotly line graph
fig = go.Figure()

fig.add_trace(go.Scatter(x=bin_labels, y=bin_counts, mode='lines+markers', marker=dict(size=8),
                         line=dict(shape='linear', color='blue'), text=marker_labels))

# Customize layout
fig.update_layout(title='Frequency of Earthquakes Over Time', title_x=0.5,
                  xaxis=dict(title='Year Bins'),
                  yaxis=dict(title='Number of Earthquakes'), height=600,
    autosize=True,  # Set autosize to True for width adjustment
)

# Show the plot
fig.show()

From this visualization, we can see a slightly increasing trend in the frequency of earthquakes over the 23-year period. This aligns with our expected findings. There are two likely causes for the increase in the number of earthquakes over time.

  1. Advancements in earthquake detection technology : It is now easier than ever to detect earthquakes (Halton). As a result, the total number of earthquakes detected has increased.
  2. Climate Change : Some research suggests that climate change, marked by phenomena such as storms and glacial melting, contributes to increased seismic activity (Atmos). As global warming becomes more severe, these phenomena intensify, which may in turn increase the number of earthquakes. These insights highlight the urgency of addressing climate challenges.

Overall, the increase in the number of recorded earthquakes over the last 20+ years is likely due to a combination of advancements in earthquake detection technology and worsening climate change. To read more about these topics, refer to the citations at the end of this notebook.
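
One way to put a number on a "slightly increasing trend" is to fit a straight line to yearly counts. The sketch below is a hedged illustration, not the analysis used in this notebook: the `counts` values are made up, and `linear_trend` is a hypothetical helper implementing ordinary least squares.

```python
def linear_trend(years, counts):
    """Ordinary least-squares slope and intercept of counts against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up yearly counts for illustration only
years = list(range(2000, 2008))
counts = [11200, 11350, 11100, 11600, 11900, 11750, 12100, 12300]

slope, intercept = linear_trend(years, counts)
print(f"Estimated change: {slope:+.1f} earthquakes per year")
```

A positive slope only describes the recorded counts. It cannot by itself separate better detection from a real change in seismic activity.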


Analyzing Types and Characteristics of Seismic Activities -¶

In this section of the analysis, we delve into the various types of seismic activities that have occurred in these years. We aim to understand the frequency and magnitude associated with each type of seismic event. By examining this information, we gain valuable insights into the diverse nature of seismic occurrences, providing a comprehensive overview of the seismic landscape. The exploration of seismic types, their frequency, and corresponding magnitudes contributes to a deeper understanding of their patterns and characteristics.

Box Plot : Magnitude and Frequency Distribution of Various Seismic Events

The box plot serves as a powerful tool for comparing the magnitude distribution of various earthquake types, offering insights into their characteristic seismic strengths.

In [30]:
fig = px.box(earthquake, x='type', y='mag', color='type',
             labels={'type': 'Earthquake Type', 'mag': 'Magnitude'})

# Remove legend
fig.update_traces(showlegend=False)

# Adjust the size and layout of the plot
fig.update_layout(
    height=600,
    autosize=True,  # Set autosize to True for width adjustment
    plot_bgcolor='rgba(192, 192, 192, 0.25)', 
    title='Box Plot of Magnitudes for Different Earthquake Types', title_x=0.5
)

fig.show()

Business Usage of the box-plot :¶

Distribution of Magnitude and Central Tendency - Explosions are observed to have a lower median magnitude than earthquakes. Knowledge of the typical magnitude distribution can impact risk assessments and safety procedures for companies operating in areas where industrial activities or explosions are frequent. Depending on the anticipated seismic impact, it might be essential to modify emergency response plans and infrastructure.

Range Variability in Magnitude - According to the box plot, earthquakes and explosions show more variation in magnitude than the other event types. Companies, especially those in the building, infrastructure, and real estate sectors, may need to take this variability into account. This knowledge can improve overall resilience by influencing the design and construction of structures to withstand a variety of magnitudes.

Planning for Infrastructure and Resilience - The variation in magnitude distributions among the various types suggests a need for adaptable infrastructure planning. Companies engaged in the planning of critical infrastructure, like those in the energy and utility sectors, may need to create systems and structures that take into consideration the wide range of seismic magnitudes. This adaptability improves overall resilience and helps maintain vital services through a range of seismic events.
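
The quantities a box plot draws (median and interquartile range per group) can also be computed directly, which makes comparisons such as "explosions have a lower median magnitude" easy to verify. This is a minimal sketch on hypothetical values; the `events` list and the `summarize` helper are assumptions for illustration, and the quartile method used here may differ slightly from the one Plotly applies.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize(events):
    """Group (type, magnitude) pairs by type and report count, median, and IQR."""
    by_type = defaultdict(list)
    for event_type, mag in events:
        by_type[event_type].append(mag)

    summary = {}
    for event_type, mags in by_type.items():
        # quantiles() needs at least two values per group
        q1, _, q3 = quantiles(mags, n=4, method="inclusive")
        summary[event_type] = {"count": len(mags), "median": median(mags), "iqr": q3 - q1}
    return summary


# Hypothetical magnitudes, NOT taken from the earthquake dataset
events = [
    ("earthquake", 4.2), ("earthquake", 4.6), ("earthquake", 5.1),
    ("earthquake", 6.3), ("earthquake", 7.0),
    ("explosion", 4.0), ("explosion", 4.1), ("explosion", 4.3), ("explosion", 4.4),
]

summary = summarize(events)
for event_type, stats in summary.items():
    print(event_type, stats)
```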


Mining/Blasting Impact on Earthquakes -¶

We now turn our attention to the potential influence of human activities, specifically mining and blasting, on seismic events. The objective is to scrutinize whether these activities correlate with an increase in the frequency and severity of earthquakes.

Q2. Has mining/blasting had any impact on the frequency and/or severity of seismic events? If so, what is the relationship?

Hypothesis: We hypothesize a positive correlation between mining/blasting activities and seismic events.

Limitation: Because our dataset only covers events with a magnitude of 4 or greater, lower-magnitude events induced by mining are likely underrepresented.

Moving forward, we also explore the relationship between nuclear activity and seismic events, seeking to unveil any discernible patterns.

Expectation: We anticipate a link between nuclear explosions and increased seismic activity.

Complexity: The relationship between nuclear activities and the magnitude of the resulting seismic events is intricate and difficult to isolate.

In [31]:
# Define URLs for mining and nuclear icons 
nuclear_icon = 'https://em-content.zobj.net/source/google/387/rocket_1f680.png'
mining_icon = 'https://em-content.zobj.net/source/samsung/380/collision_1f4a5.png'

# Filter man-made events from the earthquake DataFrame (na=False skips rows with a missing type)
manmade = earthquake[earthquake.type.str.contains('mining|mine|^explosion$|nuclear', na=False)]
manmade.type.value_counts()

# Initialize a folium map
world_map = folium.Map(location=[0, 0],
                       zoom_start=2, 
                       attr='Mapbox',
                       max_bounds=True, 
                       tiles='https://api.mapbox.com/styles/v1/mapbox/streets-v11/tiles/{z}/{x}/{y}?access_token=pk.eyJ1IjoiamFpbm0yNyIsImEiOiJjbHAwMTJrcDIwNHpwMnFvNDNwb3gyZ3V2In0.O4iM-p_iX83aK5bv4gdihQ',
                    )

# Define legend with the icon URLs
legend_items = {
    'Nuclear Explosions': nuclear_icon,
    'Mining Explosions': mining_icon,
}

# Create a legend using HTML
legend_html = '''
     <div style="position: fixed; 
                 bottom: 40px; left: 50px; width: 175px; height: 70px; 
                 border:2px solid grey; z-index:9999; font-size:14px;
                 background-color:white; text-align: center;">
     '''

# Populate the legend with items
for event_type, image_url in legend_items.items():
    legend_html += f'''
         <img src="{image_url}" style="height:30px;width:30px;"> {event_type}<br>
     '''
legend_html += '''
     </div>
     '''

# Add the legend to the map
world_map.get_root().html.add_child(folium.Element(legend_html))

# Plot markers for man-made events on the map
for index, row in manmade.iterrows():
    lat, long = row['latitude'], row['longitude']
    
    # Choose the appropriate image based on event type
    if 'nuclear' in row.get('type', '').lower():
        image_url = nuclear_icon
    else:
        image_url = mining_icon
    
    # Extract event details
    event_date = row['date']
    event_mag = row['mag']
    
    # Populate tooltip content
    tooltip_data = f'Event Date: {event_date}<br>Magnitude: {event_mag}'
    
    # Set icon size for markers
    icon = folium.CustomIcon(image_url, icon_size=(30, 30))
    
    # Add markers to the map
    folium.Marker(location=[lat, long], icon=icon, tooltip=tooltip_data).add_to(world_map)

# Add title to the map
title_html = '''
             <h3 align="center" style="font-size:20px"><b>Map of Mining and Nuclear Explosions</b></h3>
             '''
world_map.get_root().html.add_child(folium.Element(title_html))

# Display the map
world_map
Out[31]:
[Interactive map: Map of Mining and Nuclear Explosions]

In the interactive map above, it can be observed by zooming in that there is a concentration of explosions in Wyoming. Upon further investigation, it was discovered that these explosions all occurred due to mining. This is likely because Wyoming has been the top producer of coal in the United States since 1986, producing more than 40% of annual US coal supply through mining (Wyoming State Geological Survey, n.d.).

Mining explosions are far from uncommon in Wyoming's history. From the 1800s to the late 1900s, multiple fatal mining accidents occurred. Most notably, over 100 miners lost their lives in the Hanna mine explosion in 1903 (Rea, 2014).

However, our map only shows explosions in the 21st century. Given the more recent nature of our data, it is plausible that increased safety measures have reduced the severity of mining explosions compared to the 1900s, although our dataset alone cannot confirm this. A shift away from the use of coal in the electric power industry has likely contributed as well. Overall, this shift away from coal is positive in the context of seismic activity, given the potentially disastrous effects of seismic activity caused by mining.

Another area of concentration on the map above is in North Korea, where six nuclear explosions occurred between 2006 and 2017. Upon further investigation, it became clear that all six were the result of nuclear weapons testing (BBC). This discovery is a concern for any government that plans to conduct nuclear weapons testing, because underground detonations generate seismic activity.
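
Concentrations such as the ones described above can also be found without zooming into the map, by counting events per coarse latitude/longitude cell. The sketch below is an illustration under stated assumptions: the coordinates are placeholders rather than rows from the dataset, and `grid_cell` with its 5-degree cell size is a hypothetical helper.

```python
from collections import Counter
from math import floor


def grid_cell(lat, lon, size_deg=5):
    """Return the south-west corner of the size_deg x size_deg cell containing the point."""
    return (floor(lat / size_deg) * size_deg, floor(lon / size_deg) * size_deg)


# Placeholder coordinates for illustration only, NOT rows from the dataset
events = [
    (43.7, -105.3), (43.9, -105.2), (44.1, -105.6),  # three events close together
    (41.3, 129.1), (41.3, 129.0),                     # two events close together
    (-23.4, 134.2),                                   # an isolated event
]

# Count events per cell and list the busiest cells first
cell_counts = Counter(grid_cell(lat, lon) for lat, lon in events)
for cell, count in cell_counts.most_common():
    print(cell, count)
```

A fixed grid is a deliberately simple choice. Events near a cell boundary can be split across cells, so the cell size should be treated as a tuning parameter.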


In this segment of our analysis, we try to understand the critical question:

Q3. Has there been any adverse impact of earthquake activity on nuclear power plants? If so, what is the relationship?

This inquiry is paramount for understanding the potential implications of seismic events for nuclear power generation. By examining data and exploring patterns, we aim to shed light on whether earthquakes affect nuclear power plants. This investigation is crucial for ensuring the safety and resilience of regions with nuclear facilities and informs strategies for risk mitigation in the context of seismic events.

For this, we are considering a new dataset that provides information on nuclear power plants worldwide, sourced from reputable outlets like Declan Butler of Nature News and the International Atomic Energy Agency's Power Reactor Information Systems. The data includes essential details like plant location (Longitude, Latitude), region, country, number of reactors, and the affected population. Our analysis will concentrate on each plant's name, longitude, and latitude to examine its spatial relationship with earthquakes.

In [32]:
nuclear = pd.read_csv('energy-pop-exposure-nuclear-plants-locations_plants.csv')
nuclear.head()
Out[32]:
FID Region Country Plant NumReactor Latitude Longitude p90_1200 p00_1200 p10_1200 ... p10r_600 p90_300 p00_300 p10_300 p90u_300 p00u_300 p10u_300 p90r_300 p00r_300 p10r_300
0 0 Europe - Western SWEDEN AGESTA 1 59.206022 18.082872 187382000 188684000 188250000 ... 8972550.0 5013240.0 5227700.0 5471110.0 3426880.0 3596030.0 3764920.0 1586350.0 1631670.0 1706190
1 1 Europe - Western SPAIN ALMARAZ 2 39.808100 -5.696940 136675000 147718000 163429000 ... 19453700.0 17756500.0 18187800.0 20185200.0 10986000.0 11415400.0 12689800.0 6770480.0 6772380.0 7495340
2 2 America - Latin BRAZIL ANGRA 3 -23.007857 -44.458098 99195200 113894000 127898000 ... 18605600.0 39546400.0 44701700.0 50210600.0 32788500.0 37064600.0 41648300.0 6757940.0 7637110.0 8562300
3 3 America - Northern UNITED STATES OF AMERICA ARKANSAS ONE 2 35.310320 -93.231289 117830000 132729000 146482000 ... 9498240.0 5603180.0 6226360.0 6866840.0 3779400.0 4198920.0 4633770.0 1823770.0 2027450.0 2233070
4 4 Europe - Western SPAIN ASCO 2 41.200000 0.566670 271854000 287134000 308922000 ... 17594700.0 14398500.0 15095600.0 16830700.0 11215400.0 11773600.0 13151300.0 3183180.0 3322050.0 3679470

5 rows × 61 columns

In [33]:
eq_nuclear = earthquake[earthquake.mag >= 6]
eq_nuclear = eq_nuclear.sort_values(by='mag', ascending=False)

# Initialize the folium map
reactor_map = folium.Map(location=[35, 139], zoom_start=5,
                         tiles='https://api.mapbox.com/styles/v1/mapbox/streets-v11/tiles/{z}/{x}/{y}?access_token=pk.eyJ1IjoiamFpbm0yNyIsImEiOiJjbHAwMTJrcDIwNHpwMnFvNDNwb3gyZ3V2In0.O4iM-p_iX83aK5bv4gdihQ',
                            attr='Mapbox', max_bounds=True)  

# Function to categorize the earthquake colour on map based on magnitude
def earthquake_map_colors(magnitude):
    if magnitude >= 8.0:
        return '#ff9999', '#ff0000'  # Red shade
    elif magnitude >= 7.0:
        return '#ffd699', '#ff8c00'  # Orange shade
    elif magnitude >= 6.0:
        return '#ffff99', '#ffff00'  # Yellow shade

# Function to categorise the radius of impact based on the magnitude (despite the '_kms' suffix, the value is returned in meters, as folium.Circle expects)
def impact_radius_kms(magnitude):
    if magnitude >= 8.0:
        return 500000  
    elif magnitude >= 7.0:
        return 250000  
    elif magnitude >= 6.0:
        return 100000  
    else:
        return 0 

# Function to check if an earthquake's impact radius contains any nuclear plant
def nuclear_plant_vicinity_check(earthquake, radius, plants):
    nearby_plants = []
    for _, plant in plants.iterrows():
        # Using geodesic and feeding two sets of locations to find the distance between them
        distance = geodesic((earthquake['latitude'], earthquake['longitude']), (plant['Latitude'], plant['Longitude'])).kilometers
        
        # If the nuclear plant lies within the impact radius (converted from meters to km), add it to our interest list
        if distance <= radius / 1000:
            nearby_plants.append(plant)
    return nearby_plants


# Passing the highest magnitude earthquake for each location in a dictionary
highest_magnitude_earthquakes = {}

# Passing all nearby reactors to a list
all_nearby_plants = []

# Using the earthquake data to start our iteration
for _, eq in eq_nuclear.iterrows():
    
    # Passing the location as Tuple
    location = (eq['latitude'], eq['longitude'])
    impact_radius = impact_radius_kms(eq['mag'])
    nearby_plants = nuclear_plant_vicinity_check(eq, impact_radius, nuclear)

    # Looking for the highest magnitude earthquake at every location
    if nearby_plants:
        if (location not in highest_magnitude_earthquakes or eq['mag'] > highest_magnitude_earthquakes[location]['mag']):
            highest_magnitude_earthquakes[location] = eq
            all_nearby_plants.extend(nearby_plants)  # Add the point of interest plants to the list

# Sorting the earthquakes in ascending order of magnitude so the strongest ones are drawn last and appear on top
sorted_earthquakes = sorted(highest_magnitude_earthquakes.values(), key=lambda x: x['mag'], reverse=False)

# Plotting the values on the map
for eq in sorted_earthquakes:
    impact_radius = impact_radius_kms(eq['mag'])
    shadow_color, marker_color = earthquake_map_colors(eq['mag'])

    # Mapping the Impact circle
    folium.Circle(
        location=[eq['latitude'], eq['longitude']],
        radius=impact_radius,
        color=shadow_color,
        fill=True,
        fill_opacity=0.3,
        weight=0
    ).add_to(reactor_map)

    # Plotting the earthquake hover data
    folium.CircleMarker(
        location=[eq['latitude'], eq['longitude']],
        radius=4,
        color=marker_color,
        fill=True,
        fill_color=marker_color,
        tooltip=f'Earthquake: Mag {eq["mag"]} <br>Date: {eq["date"]}'
    ).add_to(reactor_map)

# Fetching all the unique nuclear plants in the range of earthquake
unique_affected_nuclear_plants = {plant['Plant']: plant for plant in all_nearby_plants}.values()

# Plot only the affected nuclear plants with a custom icon
icon_url = r'https://i.imgur.com/i4sPkgI.png'  # Passing the icon url that points each reactor on map
for plant in unique_affected_nuclear_plants:
    icon = folium.CustomIcon(icon_url, icon_size=(20, 20))  # Preparing a custom sized icon
    folium.Marker(
        location=[plant['Latitude'], plant['Longitude']],
        icon=icon,
        tooltip=f'Nuclear Plant: {plant["Plant"]}'
    ).add_to(reactor_map)
    


legend_html = '''
<div style="position: fixed; 
 bottom: 50px; left: 50px; width: 150px; height: auto; 
 border:2px solid grey; z-index:9999; font-size:14px;
 background: white; padding: 5px; border-radius: 6px;
 box-shadow: 0 0 10px rgba(0,0,0,0.5);">
 <b>Earthquake Magnitude</b> <br>
 &nbsp; <span style="height:10px;width:10px;background-color:#ff9999;display:inline-block;border-radius:50%;"></span>&nbsp; >= 8.0<br>
 &nbsp; <span style="height:10px;width:10px;background-color:#ffd699;display:inline-block;border-radius:50%;"></span>&nbsp; 7.0 - 7.9<br>
 &nbsp; <span style="height:10px;width:10px;background-color:#ffff99;display:inline-block;border-radius:50%;"></span>&nbsp; 6.0 - 6.9<br>
 <b>Nuclear Power Plant</b><br>
 &nbsp; <img src="https://i.imgur.com/i4sPkgI.png" style="width: 20px; height: 20px;"/> &nbsp; Nuclear Plant<br>
</div>
'''

# Add title to the map
title_html = '''
             <h3 align="center" style="font-size:20px"><b>Map of Nuclear Power Plants within the impact radius of Earthquakes</b></h3>
             '''
reactor_map.get_root().html.add_child(folium.Element(title_html))

# Add the legend to the map
reactor_map.get_root().html.add_child(folium.Element(legend_html))

reactor_map
Out[33]:
[Interactive map: Map of Nuclear Power Plants within the impact radius of Earthquakes]

The map above shows nuclear power plants that fall within the impact radius of an earthquake. The impact radius approximates the area an earthquake reaches based on its magnitude: 100 km for magnitudes 6.0 to 6.9, 250 km for magnitudes 7.0 to 7.9, and 500 km for magnitudes 8.0 and above. A higher-magnitude earthquake therefore has a bigger circle on the map. The large shaded circles represent the impact radius and the small solid markers represent the earthquake itself. The colors represent the magnitude of the earthquake: red circles indicate the highest magnitudes and yellow circles the lowest. Hovering over an earthquake shows its magnitude and date.
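
The vicinity check behind the map can be sketched in a few lines. The notebook uses a geodesic distance; the version below swaps in the haversine formula (a spherical approximation) so that it runs with the standard library only. The function names and the rounded coordinates are assumptions for illustration, not values read from either dataset.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def impact_radius_km(magnitude):
    """Same magnitude thresholds as the notebook, expressed in kilometers."""
    if magnitude >= 8.0:
        return 500
    if magnitude >= 7.0:
        return 250
    if magnitude >= 6.0:
        return 100
    return 0


def plant_in_range(quake, plant):
    """True if the plant lies within the earthquake's impact radius."""
    distance = haversine_km(quake["lat"], quake["lon"], plant["lat"], plant["lon"])
    return distance <= impact_radius_km(quake["mag"])


# Approximate, rounded coordinates used purely for illustration
quake = {"mag": 9.1, "lat": 38.3, "lon": 142.4}
plant = {"name": "FUKUSHIMA-DAIICHI", "lat": 37.4, "lon": 141.0}
print(plant_in_range(quake, plant))
```

For radii of hundreds of kilometers, the difference between haversine and geodesic distances is small compared with the coarse magnitude thresholds, which is why the simpler formula is adequate for a sketch.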

Now that the map has been created and explained, more in depth analysis can be done to gain insights on the research question at hand.

In the map above, it can be observed that the highest concentration of nuclear power plants is in Japan. This is because nuclear energy has been a national strategic priority for Japan since 1973 (World Nuclear Association, 2019). Our 'Earthquake Severity Around the World' map also shows that Japan is more prone to earthquakes than other areas of the world.

From these 2 observations regarding Japan, the question arises: Do earthquakes in Japan, an earthquake prone country, have an impact on the nuclear power plants in Japan?

To answer this question, further research has been done on the earthquakes and power plants in Japan.

The most notable data point in Japan on this map is the magnitude 9 earthquake. Upon further research, it became clear that this is the Great East Japan Earthquake. This earthquake caused 11 reactors at 4 power plants to shut down automatically in accordance with emergency protocol (Nuclear Power Plants and Earthquakes - World Nuclear Association, 2021). However, at the Fukushima-Daiichi plant (see the map above), an accident followed because of a tsunami caused by the earthquake. The plant's emergency protocols were not designed to withstand this type of natural disaster. The accident was provisionally rated level 5 and later raised to level 7, the highest level, on the International Nuclear and Radiological Event Scale (INES), a scale used for communicating the safety significance of nuclear and radiological events.

This short case study on Japanese nuclear power plants reveals that building nuclear power plants in earthquake-prone areas is a significant risk. Despite the protocols in place at the Fukushima-Daiichi plant, an accident at the highest INES level still occurred.

Although many of the nuclear power plants on the above map have not been researched for the purposes of this analysis, an argument can be made that these nuclear power plants should ensure that proper emergency protocols are in place in the event that more earthquakes occur near the plant.


Unveiling Earthquake Vulnerabilities: Prioritizing Preparedness and Resources¶

As we navigate through the intricacies of seismic events, our focus shifts to understanding the regions most susceptible to severe earthquakes.

Q4. Which regions of the world suffer most from severe earthquakes and how can we use this information to prioritize earthquake drills and resource allocation?

By pinpointing the regions that suffer most from severe earthquakes, we aim to enhance our preparedness strategies, ensuring that communities in high-risk zones are well-equipped to respond to and recover from seismic incidents. This analysis serves as a crucial foundation for developing targeted plans and initiatives that contribute to the safety and resilience of these earthquake-prone regions.

Interactive Map : Earthquake Severity Around the World

In [34]:
# Function to assign color based on earthquake magnitude
def magnitude_color(magnitude):
    if 7 <= magnitude < 7.5:
        return 'beige'
    elif 7.5 <= magnitude < 8:
        return 'orange'
    elif 8 <= magnitude < 8.5:
        return 'red'
    elif 8.5 <= magnitude < 9:
        return 'darkred'
    else:  # magnitude 9 and above
        return 'purple'

# Here we are filtering the dataset for earthquakes with a magnitude of 7 or higher
severe_earthquakes_df = earthquake[earthquake['mag'] >= 7]

# Initialize map with a tile layer that uses English place names - 
earthquake_map = folium.Map(location=[severe_earthquakes_df['latitude'].mean(), severe_earthquakes_df['longitude'].mean()],
                            tiles='https://api.mapbox.com/styles/v1/mapbox/streets-v11/tiles/{z}/{x}/{y}?access_token=pk.eyJ1IjoiamFpbm0yNyIsImEiOiJjbHAwMTJrcDIwNHpwMnFvNDNwb3gyZ3V2In0.O4iM-p_iX83aK5bv4gdihQ',
                            attr='Mapbox',
                            zoom_start=2, max_bounds=True)

# Create a MarkerCluster object
marker_cluster = MarkerCluster().add_to(earthquake_map)

# This function creates a custom popup marker - 
def create_popup(row):
    iframe = IFrame(f'<b>Magnitude:</b> {row["mag"]}<br>'
                    f'<b>Depth:</b> {row["depth"]} km<br>'
                    f'<b>Location:</b> {row["place"]}',
                    width=200, height=100)
    return folium.Popup(iframe, max_width=300)

# Add markers to the cluster
for index, row in severe_earthquakes_df.iterrows():
    folium.Marker(
        location=[row['latitude'], row['longitude']],
        icon=folium.Icon(color=magnitude_color(row['mag']), icon='info-sign'),
        popup=create_popup(row)
    ).add_to(marker_cluster)

# Add Custom Legend as an HTML element
legend_html = '''
     <div style="position: fixed; 
     bottom: 50px; left: 50px; width: auto; height: auto; 
     border:2px solid grey; z-index:9999; font-size:14px;
     background: #ffffff69; padding: 5px; border-radius: 6px;
     box-shadow: 0 0 10px rgba(0,0,0,0.5);">
     &nbsp; <b>Earthquake Magnitude</b> <br>
     &nbsp; <i class="fa fa-circle" style="color:beige"></i>&nbsp; 7 - 7.5<br>
     &nbsp; <i class="fa fa-circle" style="color:orange"></i>&nbsp; 7.5 - 8<br>
     &nbsp; <i class="fa fa-circle" style="color:red"></i>&nbsp; 8 - 8.5<br>
     &nbsp; <i class="fa fa-circle" style="color:darkred"></i>&nbsp; 8.5 - 9<br>
     &nbsp; <i class="fa fa-circle" style="color:purple"></i>&nbsp; >= 9
      </div>
     '''
earthquake_map.get_root().html.add_child(folium.Element(legend_html))

# Display the map
earthquake_map
Out[34]:
[Interactive map: Earthquake Severity Around the World]

The map highlights concentrated seismic activity among earthquakes with a magnitude of 7 or higher, notably along the western coasts of North and South America and along the belt extending from Japan through Southeast Asia to New Zealand.

  • Subduction Zone : Subduction zones occur when one tectonic plate is forced beneath another.

  • Pacific Ring of Fire: The horseshoe-shaped Pacific Ring of Fire, encompassing 75% of the world's active volcanoes, exhibits frequent earthquakes along a path from South America to Japan and New Zealand. This dynamic region is characterized by the convergence of major tectonic plates.

  • Seismic Belt : The region from New Zealand to Japan, linked to the Pacific Ring of Fire, experiences significant seismic activity due to the subduction of oceanic plates beneath continental plates.

  • Ring of Fire Overview : The Pacific Ring of Fire encircles the Pacific Ocean basin and is marked by the convergence of several major tectonic plates, including the Pacific Plate, North American Plate, South American Plate, Eurasian Plate, and Indo-Australian Plate. The interaction of these plates, particularly through subduction zones, results in frequent seismic and volcanic events.

  • Global Impact : Notable earthquakes, such as the 2011 Tohoku quake in Japan and the 2004 Indian Ocean event, are associated with the Ring of Fire. This region's seismic and volcanic intensity has global implications, affecting nearby and distant areas alike.
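
To turn the map into a ranked list for prioritizing drills and resources, severe earthquakes can be counted per region. The sketch below is a rough illustration: it assumes the region name follows the last comma of the `place` text, which may not hold for every row, and the place strings are hypothetical examples rather than rows from the dataset.

```python
from collections import Counter


def region_of(place):
    """Take the text after the last comma as the region; fall back to the whole string."""
    return place.rsplit(",", 1)[-1].strip()


def rank_regions(places):
    """Regions ordered by number of severe earthquakes, most affected first."""
    return Counter(region_of(p) for p in places).most_common()


# Hypothetical 'place' strings, NOT rows from the dataset
severe_places = [
    "120 km E of Example Town, Japan",
    "45 km SW of Sample City, Chile",
    "80 km N of Placeholder Bay, Japan",
    "Example Islands region",
    "30 km W of Demo Point, Indonesia",
    "60 km S of Test Harbor, Japan",
]

for region, count in rank_regions(severe_places):
    print(f"{region}: {count}")
```

Counts alone ignore population and building quality, so a ranking like this is a starting point for prioritization rather than a complete risk measure.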


Let's consider the measures taken by Japan, Indonesia, Philippines, and Chile in the context of earthquake mitigation measures:

Countries that are vulnerable to earthquakes, such as Chile, Indonesia, Japan, and the Philippines, have put in place a number of safeguards to lessen the effects of seismic activity on their people, economy, and infrastructure. Along with possible additional actions, these nations have either already taken or are considering the following corrective measures:

Common Factors:¶
  1. Investment in R&D:
     • Current: Investment levels may vary based on economic conditions.
     • Additional Measures: Prioritize research and innovation through international collaboration, focusing on cutting-edge earthquake-resistant technologies. Seek private sector funding to advance technology.
  2. International Collaboration:
     • Current: Robust economies support proactive global cooperation.
     • Additional Measures: Foster a global community for earthquake resilience through sustained collaborations, sharing resources, best practices, and research. Promote international programs addressing common issues in earthquake-prone regions.
  3. Insurance and Risk Management:
     • Current: Adoption of earthquake insurance may be influenced by economic conditions.
     • Additional Measures: Develop and promote advanced insurance products, explore global risk management partnerships, and ensure widespread access to insurance for homes and businesses. Promote collaborations with the insurance sector to establish risk-sharing arrangements.

  1. Japan : (culturetrip, 2018)

Economy : Japan boasts one of the world's largest and most advanced economies.

Earthquake Measures :

  • Japan, situated in a seismically active region, employs advanced earthquake mitigation measures.
  • An earthquake early warning system issues alerts seconds to tens of seconds before strong shaking arrives, giving people a brief window to take protective action.
  • Stringent building codes ensure structures are earthquake-resistant, prioritizing safety during seismic events.

Suggestions :

  • Continued funding for seismic research and technology can enhance forecasting and early warning systems.
  • Prioritize the resilience of vital infrastructure to minimize disruptions during and after earthquakes.
  • Collaborate with neighboring nations to establish cooperative early warning systems for regional disaster preparedness.
  • Japan must maintain a holistic and adaptive approach to seismic risk mitigation, considering the dynamic nature of natural hazards and the built environment.

  2. Indonesia : (Indonesia Tsunami Early Warning System (InaTEWS) | Department of Economic and Social Affairs, n.d.)

Economy : Classified as a developing economy, Indonesia faces resource constraints despite ongoing expansion.

Earthquake Measures :

  • Indonesia, in a seismically active region, employs an early warning system for earthquakes and tsunamis, using seismographic and tidal stations.
  • Public awareness campaigns inform communities about earthquake and tsunami preparedness, emphasizing safe havens and evacuation routes.

Suggestions :

  • Invest in resilience and retrofitting of vital infrastructure, such as ports and transportation networks, considering Indonesia's archipelagic nature.
  • Diversify the economy to reduce sensitivity to seismic events, especially in sectors like tourism.

Indonesia must regularly update its strategies, considering the evolving seismic risks and changing infrastructure and community landscape.


  3. Philippines : (PHIVOLCS Staff, 2016)

Economy : The Philippines is a developing economy with a broad economic base.

Earthquake Measures :

  • The Philippines, located in a seismically active region, relies on the Philippine Institute of Volcanology and Seismology (PHIVOLCS), which monitors earthquakes and volcanic activity and issues warnings and advisories.
  • Public awareness campaigns educate communities on earthquake preparedness through drills, emphasizing emergency kits and evacuation procedures.

Suggestions :

  • Prioritize readiness for volcanic hazards, integrating evacuation planning and early warning systems for volcanic events.
  • Integrate seismic resilience with climate change adaptation, considering shifting rainfall patterns and rising sea levels in infrastructure planning.

A comprehensive approach will enhance the Philippines' resilience to seismic risks, ensuring the safety of its people, economy, and infrastructure in the face of disasters and climate challenges.


  4. Chile : (Earthquake and Tsunami in Chile: Massive Evacuation and Building Codes to Reduce Loss of Life, 2023)

Economy : A stable and diversified economy makes Chile one of the most prosperous countries in South America.

Earthquake Measures :

  • Chile, situated along the Pacific Ring of Fire, has implemented measures including the Sistema de Alerta de Emergencia (SAE), which sends emergency alerts to mobile phones in affected areas.
  • Strict building codes ensure structures follow seismic-resistant design standards, enhancing endurance during earthquakes.

Suggestions :

  • Integrate seismic resilience into urban planning, emphasizing green spaces and restricting development in high-risk areas.
  • Enhance infrastructure resilience, including energy grids and transportation networks, through planning and retrofitting.

Chile can strengthen its resilience, protect its population, and promote sustainable development by continually adapting to the dynamic nature of seismic hazards.



Section 5 - Conclusion :

In summary, our project examined seismic activity from 2000 to 2023. First, we found no clear pattern in earthquake magnitudes over this period, apart from a spike in 2010 that coincided with the devastating earthquake in Haiti. Because this spike was not significant, we chose not to visualize it. The number of earthquakes recorded increased over the 23 years, likely due to improved earthquake detection technology and climate change. Regarding the impact of mining and blasting, we observed a high number of seismic events in Wyoming, a top coal producer in the US since 1986. While major explosions are rare nowadays, ongoing safety measures in mines remain crucial. On the question of earthquakes affecting nuclear power plants, we highlighted the risk in countries like Japan, which has both frequent earthquakes and numerous nuclear plants; the 2011 Fukushima-Daiichi accident illustrated the potential dangers. We recommend against siting nuclear plants in earthquake-prone areas to prevent significant harm. Lastly, we identified Japan, Indonesia, the Philippines, and Chile as the regions with the most severe earthquakes. Our suggestions for these nations include increased funding for earthquake research and development, international collaboration to build global resilience, and wider adoption of earthquake insurance and risk management.
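
The yearly trends summarized above (event counts and typical magnitudes) can be re-checked with a few lines of Python. The following is a minimal sketch, not the notebook's original analysis code: it assumes a USGS catalog CSV export with the `time` and `mag` columns, the helper name `yearly_summary` is hypothetical, and the sample rows are synthetic placeholders rather than real catalog data.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def yearly_summary(rows):
    """Return {year: (event_count, mean_magnitude)} for catalog rows.

    Assumes each row has the USGS CSV columns `time` (ISO 8601) and `mag`.
    Rows with a missing magnitude are skipped.
    """
    mags_by_year = defaultdict(list)
    for row in rows:
        if not row.get("mag"):
            continue
        year = int(row["time"][:4])  # ISO timestamps start with the year
        mags_by_year[year].append(float(row["mag"]))
    return {
        year: (len(mags), round(mean(mags), 2))
        for year, mags in sorted(mags_by_year.items())
    }


# Synthetic placeholder rows (not real catalog data), only to show the shape.
SAMPLE_CSV = """time,mag,type
2000-01-05T10:00:00.000Z,5.0,earthquake
2000-06-11T03:30:00.000Z,6.0,earthquake
2001-02-20T18:45:00.000Z,5.5,earthquake
2001-09-02T07:15:00.000Z,,earthquake
"""

summary = yearly_summary(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for year, (count, avg_mag) in summary.items():
    print(year, count, avg_mag)
```

To run this against a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the downloaded CSV. A rising count alongside a flat mean magnitude would be consistent with the detection-driven increase described in the conclusion.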



Section 6 - References :

  1. USGS. (2019). Search Earthquake Catalog. Usgs.gov. https://earthquake.usgs.gov/earthquakes/search/

  2. Earthquakes in a Warming World. Atmos. (2023, September 7). https://atmos.earth/earthquakes-in-a-warming-world/#:~:text=Today%2C%20earthquakes%20are%20becoming%20more

  3. Halton, M. (2018, July 4). Revolution in Quake Detection Technology. BBC News. https://www.bbc.com/news/science-environment-44683284

  4. Wyoming State Geological Survey. (n.d.). Www.wsgs.wyo.gov. https://www.wsgs.wyo.gov/energy/coal.aspx

  5. Rea, T. (2014, November 8). Thunder under the House: One Family and the Hanna Mine Disasters | WyoHistory.org. Www.wyohistory.org. https://www.wyohistory.org/encyclopedia/thunder-under-house-one-family-and-hanna-mine-disasters#:~:text=Between%201912%20and%201938%2C%20160

  6. BBC. (2019, February 25). North Korea’s missile and nuclear programme. BBC News. https://www.bbc.com/news/world-asia-41174689

  7. World Nuclear Association. (2019). Nuclear Power in Japan | Japanese Nuclear Energy - World Nuclear Association. World-Nuclear.org. https://world-nuclear.org/information-library/country-profiles/countries-g-n/japan-nuclear-power.aspx

  8. Nuclear Power Plants and Earthquakes - World Nuclear Association. (2021, March). World-Nuclear.org. https://world-nuclear.org/information-library/safety-and-security/safety-of-plants/nuclear-power-plants-and-earthquakes.aspx

  9. culturetrip. (2018, January 10). Number of Ways Japan Prepares for Earthquakes. Culture Trip. https://theculturetrip.com/asia/japan/articles/8-ways-japan-prepares-for-earthquakes

  10. Indonesia Tsunami Early Warning System (InaTEWS) | Department of Economic and Social Affairs. (n.d.). Sdgs.un.org. https://sdgs.un.org/partnerships/indonesia-tsunami-early-warning-system-inatews

  11. PHIVOLCS Staff. (2016). Earthquake Monitoring. Dost.gov.ph. https://www.phivolcs.dost.gov.ph/index.php/earthquake/earthquake-monitoring

  12. Earthquake and Tsunami in Chile: massive evacuation and building codes to reduce loss of life. (2023). Unesco.org. https://www.unesco.org/en/articles/earthquake-and-tsunami-chile-massive-evacuation-and-building-codes-reduce-loss-life#:~:text=Chile%20is%20actively%20involved%20in
